Rectified Factor Networks for Biclustering
نویسندگان
چکیده
Biclustering is evolving into one of the major tools for analyzing large datasets given as matrix of samples times features. Biclustering has several noteworthy applications and has been successfully applied in life sciences and e-commerce for drug design and recommender systems, respectively. FABIA is one of the most successful biclustering methods and is used by companies like Bayer, Janssen, or Zalando. FABIA is a generative model that represents each bicluster by two sparse membership vectors: one for the samples and one for the features. However, FABIA is restricted to about 20 code units because of the high computational complexity of computing the posterior. Furthermore, code units are sometimes insufficiently decorrelated. Sample membership is difficult to determine because vectors do not have exact zero entries and can have both large positive and large negative values. We propose to use the recently introduced unsupervised Deep Learning approach Rectified Factor Networks (RFNs) to overcome the drawbacks of existing biclustering methods. RFNs efficiently construct very sparse, non-linear, highdimensional representations of the input via their posterior means. RFN learning is a generalized alternating minimization algorithm based on the posterior regularization method which enforces non-negative and normalized posterior means. Each code unit represents a bicluster, where samples for which the code unit is active belong to the bicluster and features that have activating weights to the code unit belong to the bicluster. On 400 benchmark datasets with artificially implanted biclusters, RFN significantly outperformed 13 other biclustering competitors including FABIA. In biclustering experiments on three gene expression datasets with known clusters that were determined by separate measurements, RFN biclustering was two times significantly better than the other 13 methods and once on second place. On data of the 1000 Genomes Project, RFN could identify DNA segments which indicate, that interbreeding with other hominins starting already before ancestors of modern humans left Africa.
منابع مشابه
Rectified factor networks for biclustering of omics data
Motivation Biclustering has become a major tool for analyzing large datasets given as matrix of samples times features and has been successfully applied in life sciences and e-commerce for drug design and recommender systems, respectively. actor nalysis for cluster cquisition (FABIA), one of the most successful biclustering methods, is a generative model that represents each bicluster by two sp...
متن کاملRectified Factor Networks
We propose rectified factor networks (RFNs) as generative unsupervised models, which learn robust, very sparse, and non-linear codes with many code units. RFN learning is a variational expectation maximization (EM) algorithm with unknown prior which includes (i) rectified posterior means, (ii) normalized signals of hidden units, and (iii) dropout. Like factor analysis, RFNs explain the data var...
متن کاملGene co-expression networks via biclustering Differential gene co-expression networks via Bayesian biclustering models
Identifying latent structure in large data matrices is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are locally co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-re...
متن کاملContext Specific and Differential Gene Co-expression Networks via Bayesian Biclustering
Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-...
متن کاملPredicting Binding of Transcriptional Regulators with a Two-Way Latent Grouping Model
Binding of transcription factors to the promoter regions of genes can be measured genome-wide to reveal regulatory networks. The measurements are expensive, however, and methods for predicting bindings from earlier data would reduce the cost. At best, genome-wide studies could be targeted based on a few test samples. We model the binding patterns using recent ideas from collaborative filtering ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016